Basecalling using hidden Markov models

Authors

  • Petros Boufounos
  • Sameh El-Difrawy
  • Dan Ehrlich
Abstract

In this paper we propose hidden Markov models to model electropherograms from DNA sequencing equipment and perform basecalling. We model the state emission densities using artificial neural networks, and modify the Baum–Welch reestimation procedure to perform training. Moreover, we develop a method that exploits consensus sequences to label training data, thus minimizing the need for hand labeling. We propose the same method for locating an electropherogram in a longer DNA sequence. We also perform a careful study of the basecalling errors and propose alternative HMM topologies that might further improve performance. Our results demonstrate the potential of these models. Based on these results, we conclude by suggesting further research directions. © 2003 The Franklin Institute. Published by Elsevier Ltd. All rights reserved.
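
As a rough illustration of how an HMM can turn a trace into a base sequence, the sketch below runs Viterbi decoding over four states, one per base. It is not the authors' model: the state layout, the uniform transition matrix, the pre-picked four-channel `peaks`, and the function names `emission_logprob` and `viterbi_basecall` are all assumptions made for this example, and the toy emission score stands in for the neural-network emission densities described in the abstract.

```python
import math

BASES = "ACGT"  # one hypothetical HMM state per base


def emission_logprob(state, obs):
    """Toy emission score: log of the normalized intensity in the state's channel.

    Assumption: the paper models emissions with neural networks; this simple
    stand-in only keeps the example self-contained.
    """
    total = sum(obs)
    return math.log(max(obs[state] / total, 1e-12))


def viterbi_basecall(observations, log_trans, log_init):
    """Return the most likely base string for a list of 4-channel observations."""
    n = len(BASES)
    # score[s] = best log-probability of any state path ending in state s
    score = [log_init[s] + emission_logprob(s, observations[0]) for s in range(n)]
    backptrs = []
    for obs in observations[1:]:
        new_score, ptr = [], []
        for s in range(n):
            # best predecessor of state s at this step
            best_prev = max(range(n), key=lambda p: score[p] + log_trans[p][s])
            ptr.append(best_prev)
            new_score.append(
                score[best_prev] + log_trans[best_prev][s] + emission_logprob(s, obs)
            )
        score = new_score
        backptrs.append(ptr)
    # trace the best path backwards from the best final state
    state = max(range(n), key=lambda s: score[s])
    path = [state]
    for ptr in reversed(backptrs):
        state = ptr[state]
        path.append(state)
    path.reverse()
    return "".join(BASES[s] for s in path)


# Hypothetical inputs: uniform priors and four already-separated peaks,
# each given as (A, C, G, T) channel intensities.
uniform = math.log(0.25)
log_init = [uniform] * 4
log_trans = [[uniform] * 4 for _ in range(4)]
peaks = [
    (0.9, 0.05, 0.03, 0.02),
    (0.1, 0.1, 0.7, 0.1),
    (0.05, 0.05, 0.1, 0.8),
    (0.2, 0.6, 0.1, 0.1),
]
called = viterbi_basecall(peaks, log_trans, log_init)
print(called)
```

With uniform transitions the decoder simply picks the strongest channel at each peak; a trained transition matrix and richer state topology would be needed before the path constraint does any real work.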

Similar resources

Hidden Markov Models for Dna Sequencing

In this paper we propose Hidden Markov Models as an approach to the DNA basecalling problem. We model the state emission densities using Artificial Neural Networks, and provide a modified Baum-Welch re-estimation procedure to perform training. Moreover, we develop a method that exploits consensus sequences to label training data, thus minimizing the need for hand-labeling. Our results demonstra...

Introducing Busy Customer Portfolio Using Hidden Markov Model

Due to the effective role of Markov models in customer relationship management (CRM), there is a lack of comprehensive literature review which contains all related literatures. In this paper the focus is on academic databases to find all the articles that had been published in 2011 and earlier. One hundred articles were identified and reviewed to find direct relevance for applying Markov models...

Spatio-temporal modeling of winter rainfall occurrence and amount across Iran using a hidden Markov model

Multi site modeling of rainfall is one of the most important issues in environmental sciences especially in watershed management. For this purpose, different statistical models have been developed which involve spatial approaches in simulation and modeling of daily rainfall values. The hidden Markov is one of the multi-site daily rainfall models which in addition to simulation of daily rainfall...

Publication date: 2003